2024 iThome 鐵人賽

DAY 29

Questions

Q33

A company has a frontend ReactJS website that uses Amazon API Gateway to invoke REST APIs. The APIs perform the functionality of the website. A data engineer needs to write a Python script that can be occasionally invoked through API Gateway. The code must return results to API Gateway. Which solution will meet these requirements with the LEAST operational overhead?

  • [ ] A. Deploy a custom Python script on an Amazon Elastic Container Service (Amazon ECS) cluster.
  • [x] B. Create an AWS Lambda Python function with provisioned concurrency.
  • [ ] C. Deploy a custom Python script that can integrate with API Gateway on Amazon Elastic Kubernetes Service (Amazon EKS).
  • [ ] D. Create an AWS Lambda function. Ensure that the function is warm by scheduling an Amazon EventBridge rule to invoke the Lambda function every 5 minutes by using mock events.

描述 (Description)

  • The company's frontend is a ReactJS website that calls REST APIs through Amazon API Gateway
  • The APIs provide the website's functionality
  • A data engineer writes a Python script that is occasionally invoked through API Gateway and must return results to it
  • Choose the option with the least operational overhead

解析 (Analysis)

  • Options A and C both require building and operating a container cluster (Amazon ECS or Amazon EKS)
  • Option B needs no cluster: to run a simple Python script on demand, a Lambda function behind API Gateway is enough (see the sketch below)
  • Option D does not meet the requirement: keeping the function warm with a scheduled EventBridge rule and mock events only adds operational overhead
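
As a minimal sketch of option B, assuming an API Gateway Lambda proxy integration: the handler returns a dict with `statusCode` and a JSON-encoded `body`, which API Gateway maps back to the caller. The `compute_result` helper and the query-string handling below are placeholders, not part of the original question.

```python
import json


def compute_result(params: dict) -> dict:
    # Placeholder for the website's actual Python logic.
    return {"echo": params}


def lambda_handler(event, context):
    # With a proxy integration, API Gateway passes query string parameters here;
    # the exact payload shape depends on how the integration is configured.
    params = event.get("queryStringParameters") or {}

    result = compute_result(params)

    # Returning this structure lets API Gateway send the result back to the caller.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```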

Q34

A company has a production AWS account that runs company workloads. The company's security team created a security AWS account to store and analyze security logs from the production AWS account. The security logs in the production AWS account are stored in Amazon CloudWatch Logs. The company needs to use Amazon Kinesis Data Streams to deliver the security logs to the security AWS account. Which solution will meet these requirements?

  • [ ] A. Create a destination data stream in the production AWS account. In the security AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the production AWS account.
  • [ ] B. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the security AWS account.
  • [ ] C. Create a destination data stream in the production AWS account. In the production AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the security AWS account.
  • [x] D. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the production AWS account.

描述 (Description)

  • The company has a production AWS account that runs its workloads
  • The security team uses a separate security AWS account to store and analyze the logs
  • The security logs are stored in CloudWatch Logs in the production account
  • Amazon Kinesis Data Streams (KDS) must deliver the logs to the security account

解析 (Analysis)

  • The destination data stream lives on the receiving side, so create it in the security account
  • The sending side needs permission: an IAM role and trust policy let CloudWatch Logs put data into the stream, and the subscription filter is created in the production account, where the log group lives (see the sketch below)
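
A rough boto3 sketch of answer D's two halves, with placeholder account IDs, stream, role, destination, and log group names. Note that in practice cross-account delivery also needs a CloudWatch Logs destination and destination policy in the security account, which the exam option leaves implicit.

```python
import json

import boto3

SECURITY_ACCOUNT_ID = "222222222222"    # placeholder
PRODUCTION_ACCOUNT_ID = "111111111111"  # placeholder
REGION = "us-east-1"

# ---- In the security account: stream, role for CloudWatch Logs, destination ----
kinesis = boto3.client("kinesis", region_name=REGION)
iam = boto3.client("iam")
logs_sec = boto3.client("logs", region_name=REGION)

kinesis.create_stream(StreamName="security-logs", ShardCount=1)

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "logs.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}
iam.create_role(RoleName="CWLtoKinesisRole",
                AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.put_role_policy(
    RoleName="CWLtoKinesisRole",
    PolicyName="PutRecords",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "kinesis:PutRecord",
            "Resource": f"arn:aws:kinesis:{REGION}:{SECURITY_ACCOUNT_ID}:stream/security-logs",
        }],
    }),
)

# A CloudWatch Logs "destination" wraps the stream so another account can subscribe to it.
logs_sec.put_destination(
    destinationName="security-logs-destination",
    targetArn=f"arn:aws:kinesis:{REGION}:{SECURITY_ACCOUNT_ID}:stream/security-logs",
    roleArn=f"arn:aws:iam::{SECURITY_ACCOUNT_ID}:role/CWLtoKinesisRole",
)
logs_sec.put_destination_policy(
    destinationName="security-logs-destination",
    accessPolicy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": PRODUCTION_ACCOUNT_ID},
            "Action": "logs:PutSubscriptionFilter",
            "Resource": f"arn:aws:logs:{REGION}:{SECURITY_ACCOUNT_ID}:destination:security-logs-destination",
        }],
    }),
)

# ---- In the production account: subscription filter on the source log group ----
logs_prod = boto3.client("logs", region_name=REGION)  # credentials for the production account
logs_prod.put_subscription_filter(
    logGroupName="/security/app-logs",  # placeholder log group
    filterName="to-security-account",
    filterPattern="",                   # empty pattern forwards every event
    destinationArn=f"arn:aws:logs:{REGION}:{SECURITY_ACCOUNT_ID}:destination:security-logs-destination",
)
```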

Q35

A company uses Amazon S3 to store semi-structured data in a transactional data lake. Some of the data files are small, but other data files are tens of terabytes. A data engineer must perform a change data capture (CDC) operation to identify changed data from the data source. The data source sends a full snapshot as a JSON file every day and ingests the changed data into the data lake. Which solution will capture the changed data MOST cost-effectively?

  • [ ] A. Create an AWS Lambda function to identify the changes between the previous data and the current data. Configure the Lambda function to ingest the changes into the data lake.
  • [ ] B. Ingest the data into Amazon RDS for MySQL. Use AWS Database Migration Service (AWS DMS) to write the changed data to the data lake.
  • [x] C. Use an open source data lake format to merge the data source with the S3 data lake to insert the new data and update the existing data.
  • [ ] D. Ingest the data into an Amazon Aurora MySQL DB instance that runs Aurora Serverless. Use AWS Database Migration Service (AWS DMS) to write the changed data to the data lake.

描述 (Description)

  • The company stores semi-structured data in a transactional data lake on Amazon S3
  • File sizes vary widely, from small files to tens of terabytes
  • The data engineer must perform change data capture (CDC), i.e. identify which data has changed at the source
  • The source sends a full snapshot as a JSON file every day, and the changed data is ingested into the data lake
  • Choose the most cost-effective option

解析 (Analysis)

  • Avoid standing up databases that serve no purpose here, so rule out B and D
  • Option A would invoke the AWS Lambda function repeatedly to diff large daily snapshots, so C, which merges the snapshot into the S3 data lake with an open table format, is cheaper (see the sketch below)
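
A minimal PySpark sketch of option C, assuming Delta Lake as the open table format and a hypothetical `record_id` primary key column; Apache Hudi or Iceberg would express the same upsert with their own merge APIs.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("daily-cdc-merge")
    # Delta Lake extensions; assumes the delta-spark package is on the classpath.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Today's full snapshot from the data source (placeholder path).
snapshot = spark.read.json("s3://example-bucket/snapshots/2024/09/29/")

# Existing data lake table in Delta format (placeholder path).
lake = DeltaTable.forPath(spark, "s3://example-bucket/lake/transactions/")

# MERGE performs the CDC: rows whose key already exists are updated,
# new rows are inserted. record_id is a hypothetical primary key.
(
    lake.alias("t")
    .merge(snapshot.alias("s"), "t.record_id = s.record_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```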

Q36

A data engineer runs Amazon Athena queries on data that is in an Amazon S3 bucket. The Athena queries use AWS Glue Data Catalog as a metadata table. The data engineer notices that the Athena query plans are experiencing a performance bottleneck. The data engineer determines that the cause of the performance bottleneck is the large number of partitions that are in the S3 bucket. The data engineer must resolve the performance bottleneck and reduce Athena query planning time. Which solutions will meet these requirements? (Choose two.)

  • [x] A. Create an AWS Glue partition index. Enable partition filtering.
  • [ ] B. Bucket the data based on a column that the data have in common in a WHERE clause of the user query.
  • [x] C. Use Athena partition projection based on the S3 bucket prefix.
  • [ ] D. Transform the data that is in the S3 bucket to Apache Parquet format.
  • [ ] E. Use the Amazon EMR S3DistCP utility to combine smaller objects in the S3 bucket into larger objects.

描述 (Description)

  • A data engineer uses Amazon Athena to query data in an S3 bucket
  • AWS Glue Data Catalog serves as the metadata table
  • There is a bottleneck: the S3 bucket holds a very large number of partitions
  • The goal is to remove the bottleneck and reduce Athena query planning time

解析 (Analysis)

  • A: an AWS Glue partition index with partition filtering lets Athena prune partitions at planning time instead of listing them all
  • C: partition projection computes partition values from the S3 prefix, so Athena skips the Data Catalog partition lookup entirely (see the sketch below)
  • B, D, and E improve scan performance or file layout, but they do not shorten query planning time
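
A rough boto3 sketch covering both answers, with placeholder database, table, and partition-key names: `create_partition_index` plus the `partition_filtering.enabled` table property corresponds to option A, and the `projection.*` table properties correspond to option C. In practice a table would use one approach or the other, since projection bypasses the catalog's partition list.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

DATABASE = "analytics_db"  # placeholder
TABLE = "events"           # placeholder table, partitioned by a "dt" column

# Option A: a partition index lets Athena filter partitions at planning time
# instead of listing every partition in the Data Catalog.
glue.create_partition_index(
    DatabaseName=DATABASE,
    TableName=TABLE,
    PartitionIndex={"IndexName": "dt_index", "Keys": ["dt"]},
)

# Fetch the current table definition so its parameters can be updated.
table = glue.get_table(DatabaseName=DATABASE, Name=TABLE)["Table"]
params = dict(table.get("Parameters", {}))

# Option A (continued): tell Athena to use the partition index for filtering.
params["partition_filtering.enabled"] = "true"

# Option C: partition projection derives partition values from the S3 prefix,
# so Athena never has to enumerate partitions in the Data Catalog at all.
params.update({
    "projection.enabled": "true",
    "projection.dt.type": "date",
    "projection.dt.range": "2020/01/01,NOW",
    "projection.dt.format": "yyyy/MM/dd",
    "storage.location.template": "s3://example-bucket/events/${dt}/",
})

# get_table returns read-only fields that update_table rejects, so copy only
# the writable ones into TableInput.
writable = {"Name", "Description", "Owner", "Retention", "StorageDescriptor",
            "PartitionKeys", "TableType", "Parameters"}
table_input = {k: v for k, v in table.items() if k in writable}
table_input["Parameters"] = params
glue.update_table(DatabaseName=DATABASE, TableInput=table_input)
```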